Goto

Collaborating Authors

 ethical theory


Advancing Automated Ethical Profiling in SE: a Zero-Shot Evaluation of LLM Reasoning

Migliarini, Patrizio, Memon, Mashal Afzal, Autili, Marco, Inverardi, Paola

arXiv.org Artificial Intelligence

Abstract--Large Language Models (LLMs) are increasingly integrated into software engineering (SE) tools for tasks that extend beyond code synthesis, including judgment under uncertainty and reasoning in ethically significant contexts. We present a fully automated framework for assessing ethical reasoning capabilities across 16 LLMs in a zero-shot setting, using 30 real-world ethically charged scenarios. Each model is prompted to identify the most applicable ethical theory to an action, assess its moral acceptability, and explain the reasoning behind their choice. Responses are compared against expert ethicists' choices using inter-model agreement metrics. Our results show that LLMs achieve an average Theory Consistency Rate (TCR) of 73.3% and Binary Agreement Rate (BAR) on moral acceptability of 86.7%, with interpretable divergences concentrated in ethically ambiguous cases. A qualitative analysis of free-text explanations reveals strong conceptual convergence across models despite surface-level lexical diversity. These findings support the potential viability of LLMs as ethical inference engines within SE pipelines, enabling scalable, auditable, and adaptive integration of user-aligned ethical reasoning. Our focus is the Ethical Interpreter component of a broader profiling pipeline: we evaluate whether current LLMs exhibit sufficient interpretive stability and theory-consistent reasoning to support automated profiling. Autonomous systems are increasingly becoming an integral part of our daily lives across diverse domains [1], [2]. These systems can operate independently without any human intervention and make decisions acting on behalf of their users [3]-[6]. Their rapid growth brings both opportunities and challenges. From a software engineering perspective, as these systems become pervasive, a key challenge is designing systems that, beyond meeting technical requirements, also account for ethical considerations [7]-[11]. Recently, various studies have focused on the ethical implications of these software-intensive systems on individuals and society [10], [12]-[15]. Software engineering ethics encompasses principles and rules that guide engineers' decisions throughout the design and development process [16]. V arious approaches have also been introduced that ensure that systems align with broad ethical values like fairness, transparency, and safety [17]-[22].


Kaleidoscope Gallery: Exploring Ethics and Generative AI Through Art

Issak, Alayt, Narayan, Uttkarsh, Srinivasan, Ramya, Kleinman, Erica, Harteveld, Casper

arXiv.org Artificial Intelligence

Ethical theories and Generative AI (GenAI) models are dynamic concepts subject to continuous evolution. This paper investigates the visualization of ethics through a subset of GenAI models. We expand on the emerging field of Visual Ethics, using art as a form of critical inquiry and the metaphor of a kaleidoscope to invoke moral imagination. Through formative interviews with 10 ethics experts, we first establish a foundation of ethical theories. Our analysis reveals five families of ethical theories, which we then transform into images using the text-to-image (T2I) GenAI model. The resulting imagery, curated as Kaleidoscope Gallery and evaluated by the same experts, revealed eight themes that highlight how morality, society, and learned associations are central to ethical theories. We discuss implications for critically examining T2I models and present cautions and considerations. This work contributes to examining ethical theories as foundational knowledge that interrogates GenAI models as socio-technical systems.


Specifying Agent Ethics (Blue Sky Ideas)

Dennis, Louise A., Fisher, Michael

arXiv.org Artificial Intelligence

We consider the question of what properties a Machine Ethics system should have. This question is complicated by the existence of ethical dilemmas with no agreed upon solution. We provide an example to motivate why we do not believe falling back on the elicitation of values from stakeholders is sufficient to guarantee correctness of such systems. We go on to define two broad categories of ethical property that have arisen in our own work and present a challenge to the community to approach this question in a more systematic way.


Moral Uncertainty and the Problem of Fanaticism

Szabo, Jazon, Such, Jose, Criado, Natalia, Modgil, Sanjay

arXiv.org Artificial Intelligence

While there is universal agreement that agents ought to act ethically, there is no agreement as to what constitutes ethical behaviour. To address this problem, recent philosophical approaches to `moral uncertainty' propose aggregation of multiple ethical theories to guide agent behaviour. However, one of the foundational proposals for aggregation - Maximising Expected Choiceworthiness (MEC) - has been criticised as being vulnerable to fanaticism; the problem of an ethical theory dominating agent behaviour despite low credence (confidence) in said theory. Fanaticism thus undermines the `democratic' motivation for accommodating multiple ethical perspectives. The problem of fanaticism has not yet been mathematically defined. Representing moral uncertainty as an instance of social welfare aggregation, this paper contributes to the field of moral uncertainty by 1) formalising the problem of fanaticism as a property of social welfare functionals and 2) providing non-fanatical alternatives to MEC, i.e. Highest k-trimmed Mean and Highest Median.


Values, Ethics, Morals? On the Use of Moral Concepts in NLP Research

Vida, Karina, Simon, Judith, Lauscher, Anne

arXiv.org Artificial Intelligence

With language technology increasingly affecting individuals' lives, many recent works have investigated the ethical aspects of NLP. Among other topics, researchers focused on the notion of morality, investigating, for example, which moral judgements language models make. However, there has been little to no discussion of the terminology and the theories underpinning those efforts and their implications. This lack is highly problematic, as it hides the works' underlying assumptions and hinders a thorough and targeted scientific debate of morality in NLP. In this work, we address this research gap by (a) providing an overview of some important ethical concepts stemming from philosophy and (b) systematically surveying the existing literature on moral NLP w.r.t. their philosophical foundation, terminology, and data basis. For instance, we analyse what ethical theory an approach is based on, how this decision is justified, and what implications it entails. Our findings surveying 92 papers show that, for instance, most papers neither provide a clear definition of the terms they use nor adhere to definitions from philosophy. Finally, (c) we give three recommendations for future research in the field. We hope our work will lead to a more informed, careful, and sound discussion of morality in language technology.


Uncertain Machine Ethical Decisions Using Hypothetical Retrospection

Kolker, Simon, Dennis, Louise, Pereira, Ramon Fraga, Xu, Mengwei

arXiv.org Artificial Intelligence

We propose the use of the hypothetical retrospection argumentation procedure, developed by Sven Ove Hansson to improve existing approaches to machine ethical reasoning by accounting for probability and uncertainty from a position of Philosophy that resonates with humans. Actions are represented with a branching set of potential outcomes, each with a state, utility, and either a numeric or poetic probability estimate. Actions are chosen based on comparisons between sets of arguments favouring actions from the perspective of their branches, even those branches that led to an undesirable outcome. This use of arguments allows a variety of philosophical theories for ethical reasoning to be used, potentially in flexible combination with each other. We implement the procedure, applying consequentialist and deontological ethical theories, independently and concurrently, to an autonomous library system use case. We introduce a preliminary framework that seems to meet the varied requirements of a machine ethics system: versatility under multiple theories and a resonance with humans that enables transparency and explainability.


Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory

Takeshita, Masashi, Rafal, Rzepka, Araki, Kenji

arXiv.org Artificial Intelligence

Moral AI has been studied in the fields of philosophy and artificial intelligence. Although most existing studies are only theoretical, recent developments in AI have made it increasingly necessary to implement AI with morality. On the other hand, humans are under the moral uncertainty of not knowing what is morally right. In this paper, we implement the Maximizing Expected Choiceworthiness (MEC) algorithm, which aggregates outputs of models based on three normative theories of normative ethics to generate the most appropriate output. MEC is a method for making appropriate moral judgments under moral uncertainty. Our experimental results suggest that the output of MEC correlates to some extent with commonsense morality and that MEC can produce equally or more appropriate output than existing methods.


WizardLM: Empowering Large Language Models to Follow Complex Instructions

Xu, Can, Sun, Qingfeng, Zheng, Kai, Geng, Xiubo, Zhao, Pu, Feng, Jiazhan, Tao, Chongyang, Jiang, Daxin

arXiv.org Artificial Intelligence

Training large language models (LLMs) with open-domain instruction following data brings colossal success. However, manually creating such instruction data is very time-consuming and labor-intensive. Moreover, humans may struggle to produce high-complexity instructions. In this paper, we show an avenue for creating large amounts of instruction data with varying levels of complexity using LLM instead of humans. Starting with an initial set of instructions, we use our proposed Evol-Instruct to rewrite them step by step into more complex instructions. Then, we mix all generated instruction data to fine-tune LLaMA. We call the resulting model WizardLM. Human evaluations on a complexity-balanced test bed and Vicuna's testset show that instructions from Evol-Instruct are superior to human-created ones. By analyzing the human evaluation results of the high complexity part, we demonstrate that outputs from our WizardLM are preferred to outputs from OpenAI ChatGPT. In GPT-4 automatic evaluation, WizardLM achieves more than 90\% capacity of ChatGPT on 17 out of 29 skills. Even though WizardLM still lags behind ChatGPT in some aspects, our findings suggest that fine-tuning with AI-evolved instructions is a promising direction for enhancing LLMs. Our code and data are public at https://github.com/nlpxucan/WizardLM


Automated Kantian Ethics: A Faithful Implementation

Singh, Lavanya

arXiv.org Artificial Intelligence

As we grant artificial intelligence increasing power and independence in contexts like healthcare, policing, and driving, AI faces moral dilemmas but lacks the tools to solve them. Warnings from regulators, philosophers, and computer scientists about the dangers of unethical artificial intelligence have spurred interest in automated ethics--i.e., the development of machines that can perform ethical reasoning. However, prior work in automated ethics rarely engages with philosophical literature. Philosophers have spent centuries debating moral dilemmas so automated ethics will be most nuanced, consistent, and reliable when it draws on philosophical literature. In this paper, I present an implementation of automated Kantian ethics that is faithful to the Kantian philosophical tradition. I formalize Kant's categorical imperative in Dyadic Deontic Logic, implement this formalization in the Isabelle/HOL theorem prover, and develop a testing framework to evaluate how well my implementation coheres with expected properties of Kantian ethic. My system is an early step towards philosophically mature ethical AI agents and it can make nuanced judgements in complex ethical dilemmas because it is grounded in philosophical literature. Because I use an interactive theorem prover, my system's judgements are explainable.


Why aligning AI to our values may be harder than we think

#artificialintelligence

Plenty of scientists, philosophers, and science fiction writers have wondered how to keep a potential super-human AI from destroying us all. While the obvious answer of "unplug it if it tries to kill you" has many supporters (and it worked on the HAL 9000), it isn't too difficult to imagine that a sufficiently advanced machine would be able to prevent you from doing that. Alternatively, a very powerful AI might be able to make decisions too rapidly for humans to review for ethical correctness or to correct for the damage they cause. The issue of keeping a potentially super-human AI from going rogue and hurting people is called the "control problem," and there are many potential solutions to it. One of the more frequently discussed is "alignment" and involves syncing AI to human values, goals, and ethical standards.